(57) Abstract

A system and method for providing a natural language interface for a computer
system that interprets natural language user input and outputs responses using
natural language are disclosed. The system (102) includes a natural language
agent adapted to receive and interpret the natural language user input and to
output an output command, and at least one application agent adapted to receive
and further interpret the output command from the natural language agent and to
output an executable instruction to an application program. The natural language
agent includes a syntactic parser (102b) adapted to generate a parsed sentence
from the natural language user input, a semantic interpreter (102c) adapted to
generate the output command from the parsed sentence, and an agent communication
manager (102d) adapted to provide communication between the semantic interpreter
and the at least one application agent. Each application agent may include a
semantic task interpreter and at least one application wrapper.

ADAPTIVE NATURAL LANGUAGE INTERFACE

CROSS-REFERENCE TO RELATED APPLICATIONS 

The present application claims priority to Provisional Patent 
Application Nos. 60/105,428 entitled "Adaptive Natural Language Interface 
for Use in Applications" filed on October 3, 1998, and 60/097,630 entitled 
"Adaptive Personal Assistant (APA)" filed on August 21, 1998. 

STATEMENT OF RIGHTS TO INVENTIONS MADE UNDER
FEDERALLY SPONSORED RESEARCH

The U.S. Government has a paid-up license in certain claims of this
invention and the right in limited circumstances to require the patent owner to
license others on reasonable terms as provided for by the terms of DARPA
Contract Numbers DAAH01-96-C-R241 and DAAH01-99-C-R057.

BACKGROUND OF THE INVENTION
1. Field of Invention

The present invention relates generally to an adaptive natural language 
interface for use in applications. More specifically, the present invention 
provides a method for receiving commands, executing received commands 
and adaptively interacting with the user using a natural language interface, 
such as a natural language speech interface. 
2. Description of Related Art 

Making computers more user-friendly has long been a goal. More and
more people, including people in non-technical fields and even young kids, use
computers for various purposes, such as personal, school and/or business use.
Computer systems are also handling more complex tasks, resulting in
increasingly complex operations. Even conceptually simple tasks
require users to execute multiple complex steps for completion of the tasks.

Further, when a user switches between different application programs
or vendors, e.g. from MICROSOFT EXCHANGE to NETSCAPE, the same
conceptual task requires the operator to learn a new set of steps to complete
the same task. For example, to perform a conceptually simple task such as
finding out whether the user has received a certain message, the user must be
trained in the platform-specific graphical user interface of scrolling and the
vendor-specific method of viewing new mail. As is evident, a conceptually simple
task may require the user to execute multiple complex steps.

As the number of users and the complexity of computer systems
increase, there is an increased need for computer systems and computer
applications that require little or no training for the user to use. There is also
an increased need for a method to efficiently and productively use, manipulate
and control computers and applications running on computers.

Natural or spoken language is an efficient method for people to
communicate and express commands. For example, voice-recognition methods
and software have been developed and are commercially available. Although
some of these methods and software allow the user to speak certain commands
for the computer to execute, they
support only a predetermined set of commands at a very low level of
abstraction. The user must learn the precise words and syntax that the
software can accept. In other words, the voice communication cannot handle
and interpret high-level, abstract, natural language commands.

Because natural language is an efficient and easy method for people to 
communicate and express commands, there is a long felt need for a voice- 
based command system and interface that can handle high-level, abstract 
commands and that responds to natural language. 

The Air Force Institute of Technology, MIT Media Lab, Oregon 
Graduate Institute, Microsoft and IBM are examples of groups conducting 
research in the area of spoken language input. See, for example, Ball,
"Mixing Scripted Interaction with Task-Oriented Language Processing in a
Conversational Interface," International Conference on Intelligent User
Interfaces, Jan. 5-8, 1999, Redondo Beach, CA, pp. 101-104.

U.S. Patent No. 5,748,974 assigned to IBM Corp. describes an 
example of spoken language input and, more specifically, a multimodal 
natural language interface for cross-application tasks. The multimodal natural 
language interface interprets user requests by combining natural language 
input from the user (spoken, typed or handwritten) with information selected 
from an application currently in use by the user to perform a task in another 
auxiliary application for processing. The information is selected by a standard 
technique from the current application. 

Copending U.S. Patent Application Serial No. 08/919,138, assigned to 
the assignee of the present application and incorporated herein in its entirety 
by reference, describes a natural-language speech control method. The 
natural-language speech control method produces a command for controlling 
the operation of a computer from words spoken in a natural language. The 
method includes processing an audio signal representing the spoken words of
a user to generate textual digital computer data (e.g. ASCII text), processing
the textual digital computer data with a natural language syntactic parser to
produce a parsed sentence that includes a string of words with each word
being associated with a part of speech in the parsed sentence, and generating
the command from the parsed sentence.

SUMMARY OF THE INVENTION 

The present invention comprises a method for receiving commands 
and/or adaptively outputting results and responses using a natural language 
interface, such as a natural language speech interface. The method utilizes an 
agent-based architecture comprising a front-end natural language agent and 
one or more application task agents for each class of applications. 

It should be appreciated that the present invention can be implemented
in numerous ways, including as a process, an apparatus, a system, a device, a
method, or a computer readable medium such as a computer readable storage
medium or a computer network wherein program instructions are sent over
optical or electronic communication lines. Several inventive embodiments of
the present invention are described below.

In one embodiment, the natural language interface for a computer
system includes a natural language agent adapted to receive and interpret the
natural language user input and to output an output command, and at least one
application agent adapted to receive and further interpret the output command
from the natural language agent and to output an executable instruction to an
application program. The natural language agent includes a syntactic parser
adapted to generate a parsed sentence from the natural language user input, a
semantic interpreter adapted to generate the output command from the parsed
sentence, and an agent communication manager adapted to provide
communication between the semantic interpreter and the at least one
application agent. Each application agent may
include a semantic task interpreter adapted to generate the executable
instruction from the output command of the natural language agent, and at
least one application wrapper, each wrapper configured to communicate with a
corresponding application program.

In another embodiment, a computer readable medium on which are
stored natural language interface instructions executable on a computer
processor is disclosed. The natural language interface instructions generally
comprise receiving natural language user input, generating a parsed sentence
from the natural language user input, mapping the parsed sentence into a
semantic action, and generating an instruction from the semantic action, the
instruction being executable by an application.

In yet another embodiment, a method for receiving, interpreting and
executing natural language input is disclosed. The method generally
comprises receiving natural language user input, generating a parsed sentence
from the natural language user input, semantically interpreting the parsed
sentence and generating an output command from the parsed sentence,
outputting the output command to an application class agent, semantically
interpreting the output command and generating an executable instruction
from the output command, and outputting the executable instruction to an
application program for execution by the application program.

The present invention is a method for abstracting complex sequences of
computer operations into a conceptually simple task. The natural language
interface parses the user's input and semantically maps it into a knowledge
concept structure. The system then determines which application context
should be responsible for interpreting and executing that command concept.
The system utilizes task application wrappers to map the complex application
tasks to vendor-specific executable tasks. Thus, the natural language interface
system of the present invention allows users to control multiple desktop
applications by abstract commands.

The system of the present invention lowers the barrier to entry to 
computing and greatly increases productivity by combining a spoken language 
system with the ability to handle higher order abstract commands in naturally 
spoken language. The system combines a spoken language interface with a 
knowledge-based semantic interpretation such that semantically equivalent 
abstractions result in the same operation. Syntactic and semantic
interpretation of spoken language enables ease of use and complexity
abstraction and provides the user access to computing through spoken
language.

The system and method can be adapted to user preferences with 
feedback using active and passive relevance feedback techniques. Further, the 
present invention may include a natural language based help system in the 
natural language agent and each application class agent that collaborate with 
the user in offering assistance. For example, the system may prompt the user
for semantically correct input, help the user complete tasks, and remind the
user of tasks that need to be done.



00/1 1 571 PCT/US99/m55 

The system of the present invention may be utilized and is compatible
with existing software applications and platforms. The system uses a set of
application class agents and wrappers that provide an interface between the
application class agent and different applications in the class. Each agent
works with a class of applications, such as electronic mail, and communicates
with specific applications through application wrappers. Thus, with a modular
distributed agent architecture, the system and method of the present invention
are extendable to multiple applications and scalable to large sets of networked
computer systems.

These and other features and advantages of the present invention will 
be presented in more detail in the following detailed description and the 
accompanying figures which illustrate by way of example the principles of the 
invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a schematic illustration of the system and method of the
present invention comprising an adaptive natural language interface for use in
executing computer applications;

FIG. 2 is a schematic illustration of the natural language agent;

FIG. 3 shows a simplified model of a traditional dialog manager for
ordering pizza through an interactive system;

FIG. 4 is a schematic illustration of the application class agent;

FIG. 5 illustrates the mapping of natural language into a set of
semantic tasks by each task agent;

FIG. 6 illustrates an example of a personality assessment grid;

FIG. 7 illustrates an example of a computer system that can be utilized
to execute the software of an embodiment of the invention and use hardware
embodiments; and

FIG. 8 illustrates a system block diagram of the computer system of
FIG. 7.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention comprises a system and method for receiving
commands and/or adaptively outputting results using a natural language
interface, such as a natural language speech interface. The system and
method are an agent-based architecture comprising a front-end natural
language agent and an application class task agent for each class of
applications. The system and method may include adapting to each user,
including the user's speech pattern, the current or recent commands issued by
the user and the user's preferences. The following description is presented to
enable any person skilled in the art to make and use the invention.
Descriptions of specific embodiments and applications are provided only as
examples and various modifications will be readily apparent to those skilled in
the art. The general principles defined herein may be applied to other
embodiments and applications without departing from the spirit and scope of
the invention. Thus, the present invention is to be accorded the widest scope
encompassing numerous alternatives, modifications and equivalents consistent
with the principles and features disclosed herein. For purpose of clarity,
details relating to technical material that is known in the technical fields
related to the invention have not been described in detail so as not to
unnecessarily obscure the present invention.

Referring to the schematic illustration of FIG. 1, there is shown an 
adaptive natural or spoken language user interface system 100 for use in 
executing computer applications. The interface system 100 generally 
comprises a voice or front-end natural language agent 102 and one or more 
task agents 104a-d. As is generally shown, the user 106 communicates an 
input phrase, command or sentence 108 to the natural language agent 102 
which processes the input sentence and sends the input sentence to the 
appropriate one of the back-end application class task agents 104a-d. 
Examples of the task agents 104a-d shown in FIG. 1 are meeting agent 104a, 
personal information manager agent 104b, email agent 104c, and voice 
training agent 104d. Each task agent 104a-d outputs to the natural language 
agent 102 which then delivers the natural language output 110 to the user 106. 

Each of the back-end application class task agents 104a-d works with a
class of one or more existing computer applications. The interface system can
be adapted to existing computer applications so that users can operate a
computer using spoken language as well as other input devices such as
keyboard and pointing devices, giving a full multi-modal interface to existing
computer applications.

Although the natural language user interface system 100 is generally 
described as one interacting in spoken natural language, the system 100 may 
be configured to receive and/or output using one or more alternative input 
and/or output mechanisms while utilizing natural language for such input 
and/or output interactions. Suitable alternative modes of input and/or output 
include keyboard, mouse, touch or contact sensitive screen, and/or screen
display.

FIG. 2 is a schematic illustration of the natural language agent 102.
The natural language agent 102 communicates with the user 106 through
spoken language. The natural language agent 102 preferably includes:

• automatic speech recognition system 102a;
• natural language syntactic parser 102b;
• natural language semantic interpreter 102c;
• agent communication manager 102d;
• adaptive preference manager 102e;
• dialog manager 102f; and
• text-to-speech synthesizer 102g.

The natural language agent 102 executes a first level interpretation of
the natural language input. The front end natural language agent 102 receives
all natural language input and determines to which of the available task agents
104 to pass the natural language input as interpreted by the front end natural
language agent 102. The task agent 104 to which the natural language input
was passed may return a response such as an output to the front end natural
language agent 102. The front end natural language agent 102 then outputs
the response from the particular task agent 104 to the user 106. The natural
language agent 102 may itself return a response if it determines that the
original natural language input is incomplete, incorrect, or otherwise cannot be
properly interpreted.


Each of these components 102a-g of the natural language agent 102
is described in more detail below.



Automatic Speech Recognition System 102a 

Automatic speech recognition systems for speech input are readily and 
commercially available off the shelf. Any suitable off the shelf speech 
recognition systems may be used as the automatic speech recognition system 
102a in the natural language interface system 100 of the present invention. 
Thus, details of speech recognition methods and systems are not described 
herein. In addition, error correcting techniques and cue words may be utilized 
to improve accuracy and allow for dialog management to effectively recognize 
speech input. 

Natural Language Syntactic Parser 102b 

There are generally three basic approaches to natural language
syntactic processing: simple grammar, statistical and Government-and-Binding
(GB-based). Simple grammar is used for simple, non-complicated
syntax. The statistical approach examines word patterns and word co-occurrence
and attempts to parse natural language sentences based on the likelihood of
such patterns. The statistical approach uses a variety of methods such as neural
networks and word distribution. The statistical approach is limited by an
upper limit on error rate and has great difficulty handling wide varieties of
linguistic phenomena such as scrambling, NP (noun phrase) movement, and
binding between question words and empty categories.

The GB-based approach is described in, for example, "Some Concepts
and Consequences of the Theory of Government and Binding," Cambridge,
MA, MIT Press, the entirety of which is incorporated herein by reference. The
GB-based approach is a more robust approach to natural language parsing
using computational methods based on linguistic theory of a universal
language. The GB-based approach reveals implied syntactic structure in English
language sentences and thus better facilitates resolving ambiguous syntactic
structures. By using generalized principles and parameters, the GB-based
approach allows a customizable and portable parser that can be tailored to
different environments and languages with little modification.

Preferably, the natural language syntactic parser 102b utilizes a GB-based
principles and parameters framework to parse natural language computer
commands. Haegeman, L., Introduction to Government and Binding Theory,
incorporated by reference herein, for example, describes this concept. With
the generalized principles and parameters, GB-based approaches can describe
a large syntax and vocabulary relatively easily and thus may result in higher
robustness than other approaches. With a GB-based approach, commands to
computers can be seen as verb phrases that are a subset of complete English
sentences. The sentences have an implied second person singular pronoun
subject and the verb is active present tense.

For example, to resume work on a previous project, the user 106 may 
state "show me the first message." This request would be parsed into the 
following structure: 

(VP (Vbar (V (V_IP
               (V_IP show [present sg])
               (IP
                 (NP (Nbar (N me [goal animate sg])))
                 (Ibar (NP [theme inanimate sg]
                           (Det the)
                           (Nbar
                             (AP (Abar (A first)))
                             (N message)))))))))

The parse would allow the computer to map the verb into a computer
command action, with the noun phrase (NP) as the object and the adjective
phrase (AP) as properties of the object.
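
As a minimal sketch of this mapping, the parse can be held as nested tuples
and walked to pull out the verb, the nouns and the adjectives. The tuple
encoding, the function names words_under and to_command, and the rule that the
first noun is the goal and the last noun is the object are all illustrative
assumptions, not part of the disclosed parser.

```python
# Hypothetical nested-tuple encoding of the parse of "show me the first message".
# Each node is (label, child, ...); a leaf word is a plain string.
PARSE = (
    "VP",
    ("Vbar",
        ("V",
            ("V_IP",
                ("V_IP", "show"),
                ("IP",
                    ("NP", ("Nbar", ("N", "me"))),
                    ("Ibar",
                        ("NP",
                            ("Det", "the"),
                            ("Nbar",
                                ("AP", ("Abar", ("A", "first"))),
                                ("N", "message"),
                            ),
                        ),
                    ),
                ),
            ),
        ),
    ),
)


def words_under(node, label):
    """Collect, in sentence order, the words directly under nodes with this label."""
    if isinstance(node, str):
        return []
    head, *children = node
    found = []
    if head == label:
        found.extend(child for child in children if isinstance(child, str))
    for child in children:
        found.extend(words_under(child, label))
    return found


def to_command(parse):
    """Map verb -> action, nouns -> goal/object, adjectives -> object properties."""
    verbs = words_under(parse, "V_IP")
    nouns = words_under(parse, "N")
    adjectives = words_under(parse, "A")
    return {
        "action": verbs[0],
        "goal": nouns[0],        # assumption: first noun is the goal ("me")
        "object": nouns[-1],     # assumption: last noun is the object acted on
        "properties": adjectives,
    }


print(to_command(PARSE))
```

A real parser would carry the role annotations (goal, theme) on the nodes
themselves rather than rely on word order as this sketch does.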



Natural Language Semantic Interpreter 102c 

The natural language semantic interpreter or interpretation engine 102c
is preferably a frame-based command interpretation system. The natural
language semantic interpreter 102c may interpret the syntactic parse using
context sensitive methodologies. The natural language semantic interpreter
102c uses a knowledge base populated with concept-interfaces that each
application can handle. The natural language semantic interpreter 102c takes
the syntactic parse of a spoken language request and maps it into a generic
concept frame used to invoke the appropriate application method. TABLE I
lists examples of concept-interfaces.



TABLE I

ACTION-CONCEPT    TOPIC-CONCEPT    APPLICATION-CONTEXT
Show              Email            Email application
Show              Email address    Address book application
Delete            Email            Email application
Show              General help     Natural language agent

Requests input to the computer are preferably transformed by the
semantic interpretation engine 102c from a syntactic parse into a variable
length verb-head frame. The process has variable length noun phrases as
arguments. The noun phrases in turn have arguments that are adjective
phrases. The verb-head describes an action-concept. The noun phrases
describing the objects on which the actions are performed are topic-concepts
and the adjective phrases describing the type of objects are modifier concepts.

Reverse Grammar Generation Mechanism 

The semantic interpretation engine 102c may also include a reverse
grammar generation mechanism. The reverse grammar generation mechanism
may be implemented in each agent, i.e. the natural language agent and/or each
of the task agents. The reverse grammar generation mechanism includes a list
or vector for each word and corresponding probabilities for each word in the
list. For example, for the word "I," "eye," or "aye," the associated vector or
list includes those words, i.e. "I," "eye," and "aye," and may have
corresponding probabilities of 80%, 15%, and 5%. These probabilities may be
predetermined and may be adjusted depending upon each user's selection of
words used or depending upon a subset or all of the users' selection of words
used.

Upon receiving the syntactic parse of the spoken language request, the
semantic interpretation engine 102c determines the permutations of the
syntactic parse using the list for each word. For example, using the exemplary
vector above and ignoring the lists for all the other words, if an input request is
"I want to go home," the permutations of the syntactic parse may include:

"I want to go home";
"Eye want to go home"; and
"Aye want to go home."

Using the permutations, the semantic interpretation engine 102c
determines which words best fit the grammar of the syntactic parse. To
determine each best fit word, the word with the highest probability, i.e. "I" in
the example above, is evaluated to determine whether that word is suitable given
the context. If the word is not suitable given the context, the word having
the next highest probability is then evaluated in the same way,
until a suitable word is determined. Of course, if
no suitable word is determined, then the natural language agent may request
clarification or correction from the user.
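
The best-fit search described above can be sketched as follows. This is a
minimal illustration only: the names GROUPS, best_fit, resolve and
pronoun_subject_rule are hypothetical, and the toy context rule stands in for
whatever grammar check the semantic interpretation engine actually applies.

```python
# Hypothetical candidate list mirroring the "I" / "eye" / "aye" example.
GROUPS = [
    [("I", 0.80), ("eye", 0.15), ("aye", 0.05)],
]
# Every member of a group looks up the same candidate list.
LOOKUP = {word.lower(): group for group in GROUPS for word, _ in group}


def best_fit(word, is_suitable):
    """Try candidates from most to least probable; return the first that fits."""
    candidates = LOOKUP.get(word.lower(), [(word, 1.0)])
    for candidate, _probability in sorted(candidates, key=lambda c: c[1], reverse=True):
        if is_suitable(candidate):
            return candidate
    return None  # no suitable word: the caller should ask the user to clarify


def resolve(words, is_suitable):
    """Resolve every word of a request; None means clarification is needed."""
    resolved = []
    for position, word in enumerate(words):
        choice = best_fit(word, lambda cand: is_suitable(cand, position, words))
        if choice is None:
            return None
        resolved.append(choice)
    return resolved


PRONOUNS = {"I", "you", "we", "they"}


def pronoun_subject_rule(candidate, position, words):
    """Toy context check (an assumption): the subject of "want" must be a pronoun."""
    if position == 0 and len(words) > 1 and words[1] == "want":
        return candidate in PRONOUNS
    return True


print(resolve("Eye want to go home".split(), pronoun_subject_rule))
```

Adjusting the probabilities per user, as the text describes, would amount to
updating the numbers in the candidate lists as selections are observed.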

A combination of the action-concept and the topic-concepts is used to
determine which task agent should handle the request. If the request is for a
specialized task agent, the request is routed to that specialized task agent. If
the request is for the natural language agent 102 itself, a routine associated
with the command is invoked with the topics and modifiers as arguments.
Using the arguments for routing commands allows for better disambiguation
than with the verb alone.

The above-described interpretation approach has the advantage of
allowing the natural language agent 102 to query the user for clarification, for
example, if the original request is incomplete, or otherwise cannot be properly
interpreted by the natural language agent 102. For instance, if arguments do
not match the verb, then a clarification can be requested by the natural
language agent 102.
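
A minimal sketch of this routing, using the concept-interfaces of TABLE I as
the lookup data. The dictionary layout and the function name route are
assumptions for illustration; the disclosed knowledge base is not specified at
this level of detail.

```python
# Concept-interfaces from TABLE I, keyed by (action-concept, topic-concept).
ROUTES = {
    ("show", "email"): "Email application",
    ("show", "email address"): "Address book application",
    ("delete", "email"): "Email application",
    ("show", "general help"): "Natural language agent",
}


def route(action, topics):
    """Return the application context for a request, or None if nothing matches.

    The verb alone is ambiguous ("show" maps to three contexts), so the
    topic-concepts are consulted together with the action-concept.
    """
    for topic in topics:
        context = ROUTES.get((action.lower(), topic.lower()))
        if context is not None:
            return context
    return None  # arguments do not match the verb: request clarification


print(route("Show", ["Email address"]))
print(route("Open", ["Email"]))
```

A None result corresponds to the case in which the natural language agent
queries the user for clarification.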


Further, the above-described interpretation approach has the advantage
of allowing the natural language agent 102 to properly interpret natural
language requests without requiring the user to input the request conforming
to specific structures. For example, in requesting an airline ticket from
Portland to Boston, the user may state "I'd like a ticket to Boston from
Portland" or "I'd like a ticket from Portland to Boston." In response, the
natural language agent 102 may request clarification as to Portland, Oregon or
Portland, Maine, for example. The above-described interpretation approach
also has the advantage that it does not rely upon certain key words in order to
properly interpret the user's requests. Further, the interpretation technique
may be context based or context sensitive.

Agent Communication Manager 102d 

The application class task agents 104 may communicate with each
other, preferably using Knowledge Query and Manipulation Language (KQML) or
any other suitable language, via the agent communication manager or module
102d. The contents of a message between application class agents 104 may be
coded in any suitable format, preferably the Knowledge Interchange Format
(KIF). When a KQML message with an "achieve" performative is received by
an agent 104, the KIF encoded concept structure is interpreted further by the
agent 104 through a semantic interpretation knowledge base similar to the one
described above with reference to the semantic interpreter 102c. In this case,
the knowledge base only includes information on how to map application-specific
modifiers to application-task parameters. Using KQML and KIF
allows the different agents 104 to easily communicate with each other. In
particular, the natural language agent 102 sends the user's request to the
application class agent 104 via the agent communication manager 102d and
the application class agent 104 sends requests back to the natural language
agent 102, or some other agent, via the agent communication manager 102d.
Thus an email class agent 104c can request information from a file manager
class agent (not shown) using a KQML/KIF statement via the agent
communication manager 102d.
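
To illustrate the shape of such a message, the sketch below assembles a KQML
"achieve" performative as a string. The reserved parameter names (:sender,
:receiver, :language, :ontology, :content) are standard KQML; the agent names,
the ontology name and the content expression are hypothetical placeholders,
and the helper kqml_message is not part of the disclosed system.

```python
def kqml_message(performative, sender, receiver, content,
                 language="KIF", ontology=None):
    """Format a KQML performative with the standard reserved parameters."""
    fields = [("sender", sender), ("receiver", receiver), ("language", language)]
    if ontology is not None:
        fields.append(("ontology", ontology))
    fields.append(("content", content))
    body = " ".join(f":{name} {value}" for name, value in fields)
    return f"({performative} {body})"


# Hypothetical request from the natural language agent to the email class agent.
message = kqml_message(
    "achieve",
    sender="nl-agent",
    receiver="email-agent",
    content="(show (message first))",  # placeholder concept structure
    ontology="email",
)
print(message)
```

The receiving agent would parse the :content expression and interpret it
against its own task-specific knowledge base.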

The above-described approach has the advantage of modularizing the
different ontologies by allowing different application class agents 104 to have
different subset dictionaries and task specific semantic interpretation
knowledge bases. It also allows the class agents 104 to handle vendor-specific
application features by easily modifying the local semantic interpretation
tables. This is described in more detail below with reference to the application
class agent 104.

The task routing mechanism is similar to Galaxy-II, a voice controlled
system integrating three separate voice controlled systems, as discussed in
Seneff et al., "Galaxy-II: A Reference Architecture for Conversational System
Development," 5th International Conference on Spoken Language Processing,
Nov. 30 - Dec. 4, 1998, Sydney, Australia, p. 931, the entirety of which is
incorporated by reference herein. Currently, Galaxy-II requires a user to
explicitly switch from one domain to another.

Adaptive Preference Manager 102e 

The adaptive preference manager 102e is associated with the natural
language agent 102 and with each user 106. The task of the adaptive
preference manager 102e is to learn what default conditions are preferred by the
user 106 by monitoring the user's actions implicitly (i.e. observing in the
background) and/or by being instructed explicitly by the user 106 on positive
and/or negative preferences. These preferences may be shared by different
users 106 running similar application class agents 104.

The adaptive preference manager 102e uses relevance feedback 
techniques. Relevance feedback technique is widely used for preference 
optimization with declarative preferences. A request for executing an action 
based on preferences can be modeled as a query to locate a document in a 
collection of documents. In this technique widely used in information 
retrieval, the relevance of a document to a query is measured by how many 
matches the document has with query terms. In the realm of preference 
requests, the result of an action is analogous to a document where the 
preference is analogous to a query. Using this substitution, information 
retrieval techniques for ranking results of action requests can be adapted 
according to user preferences. Criteria specified in the spoken request are also 
factored as preferences. For preference matching, the information retrieval 
formula can be adapted for preference ranking by simplifying the equation for 
small queries as expressed in equation (1): 



$$
\mathrm{similarity}(Q, D) = \frac{\sum_{i=1}^{t} w_{iq} \times w_{ij}}{\left( \sum_{i=1}^{t} (w_{iq})^{2} \times \sum_{i=1}^{t} (w_{ij})^{2} \right)^{1/2}} \qquad (1)
$$

where

$t$ = total number of independent terms
$w_{iq} = (0.5 + (0.5 \, qfreq_{iq} / maxfreq_{q})) \times IDF_i$
$w_{ij} = dfreq_{ij} \times IDF_i$
$qfreq_{iq}$ = frequency of term i in request q
$dfreq_{ij}$ = frequency of term i in result j
$maxfreq_{q}$ = maximum frequency of any term in query q
$IDF_i = \log_2 (maxn / n_i) + 1$
$N$ = number of results
$n_i$ = total number of occurrences of term i in the results
$maxn$ = maximum frequency of any term in the results
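
Assuming equation (1) has the standard cosine form used in information
retrieval (the sum of the products of the request and result weights, divided
by the square root of the product of the two sums of squared weights), the
ranking can be sketched as below. The function names and the sample data are
illustrative assumptions, as is the choice to give a zero request weight to
terms that do not appear in the request.

```python
import math


def idf(n_i, maxn):
    """IDF_i = log2(maxn / n_i) + 1, per the definitions for equation (1)."""
    return math.log2(maxn / n_i) + 1


def similarity(query_freq, result_freq, n, maxn):
    """Cosine-style similarity between a request and one result.

    query_freq / result_freq map term -> frequency; n maps term -> total
    occurrences of the term in all results; maxn is the largest such total.
    """
    max_q = max(query_freq.values())
    numerator = sum_q_sq = sum_d_sq = 0.0
    for term in set(query_freq) | set(result_freq):
        term_idf = idf(n[term], maxn)
        q_freq = query_freq.get(term, 0)
        # Assumption: terms absent from the request contribute no request weight.
        w_q = (0.5 + 0.5 * q_freq / max_q) * term_idf if q_freq else 0.0
        w_d = result_freq.get(term, 0) * term_idf
        numerator += w_q * w_d
        sum_q_sq += w_q ** 2
        sum_d_sq += w_d ** 2
    denominator = math.sqrt(sum_q_sq * sum_d_sq)
    return numerator / denominator if denominator else 0.0


# Hypothetical preference request and two candidate results.
query = {"cheap": 1, "flight": 1}
result_a = {"cheap": 2, "flight": 1}
result_b = {"hotel": 3}
totals = {"cheap": 2, "flight": 1, "hotel": 3}

sim_a = similarity(query, result_a, totals, maxn=3)
sim_b = similarity(query, result_b, totals, maxn=3)
print(sim_a, sim_b)
```

Results would then be ranked in descending order of similarity to the request.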



The qualitative ranking can be quantified by adding a set of weights to
the ranking equations (2) and (3) as set forth below to incorporate the weights
applied to terms in the definition of IDFi.

$$
maxn = \max_{j} \left( \sum_{i=1}^{t} (w_i \times n_{ij}) \right) \qquad (2)
$$

$$
n_i = \sum_{j} (w_i \times n_{ij}) \qquad (3)
$$

Relevance feedback techniques have been used in information retrieval
to improve the precision and recall of queries. In relevance
feedback, the query terms are reweighted based on the user's selection of the
retrieved items. For the case where the user does not exhaustively select all the
relevant responses, the term weights can be reweighted using
equations (4) and (5).

Initial weights: $w_{ijk} = (C + IDF_i) \times f_{ik}$ (4)

Feedback: $w_{ijk} = \left(C + \log \frac{p_{ij}(1 - q_{ij})}{(1 - p_{ij})\, q_{ij}}\right) \times f_{ik}$ (5)

where:

$w_{ijk}$ = weight for term i in preference j and result k

$IDF_i$ = the IDF weight for the term i in the entire set of results

$p_{ij}$ = probability of the term i within the set of relevant results for
preference j

$q_{ij}$ = probability that term i is assigned within the set of non-relevant
results for preference j

$f_{ik} = K + (1 - K) \times freq_{ik} / maxfreq_k$

$freq_{ik}$ = the frequency of term i in result k

$maxfreq_k$ = maximum frequency of any term in result k.

As noted above, the execution of a task with variable parameters can 
be modeled as an information retrieval query. In this case the weights for the 
query term can be modeled as the user's preference weights. 
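
The reweighting of equations (4) and (5) can be sketched as follows. This is a minimal illustration rather than the implementation of the preference manager: the default values chosen for the constants C and K are hypothetical, the logarithm is assumed to be the natural logarithm because the text does not name a base, and the probabilities are assumed to lie strictly between 0 and 1.

```python
import math


def f_ik(freq_ik, maxfreq_k, K=0.5):
    """Normalized frequency factor: K + (1 - K) * freq_ik / maxfreq_k."""
    return K + (1 - K) * freq_ik / maxfreq_k


def initial_weight(idf_i, freq_ik, maxfreq_k, C=1.0, K=0.5):
    """Equation (4): weight before any feedback has been collected."""
    return (C + idf_i) * f_ik(freq_ik, maxfreq_k, K)


def feedback_weight(p_ij, q_ij, freq_ik, maxfreq_k, C=1.0, K=0.5):
    """Equation (5): weight after the user has selected relevant results.

    p_ij and q_ij must lie strictly between 0 and 1, otherwise the
    odds ratio is undefined.
    """
    odds = (p_ij * (1 - q_ij)) / ((1 - p_ij) * q_ij)
    return (C + math.log(odds)) * f_ik(freq_ik, maxfreq_k, K)
```

A term that appears mostly in results the user selected (high p, low q) receives a larger weight than a term that is equally likely in selected and unselected results, which is how repeated selections gradually shape the preference weights.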

Help System 

With a natural language based system that abstracts the semantic
concepts from the complexity of tasks, much of the help system is implicitly
encoded in the knowledge base. Instead of asking "How can I send my
spreadsheet to John?", the user asks the natural language agent 102 to "Send the
spreadsheet to John." If invalid parameters are given, the user 106 is
prompted for the correct parameters. However, the natural language interface
system 100 is also able to handle requests for help by generating an
explanation of how the request functions. It can also show an example of a
typical user request to accomplish the task.

Dialog Manager 102f

The natural language agent 102 further includes a dialog manager
102f. The dialog manager 102f of the natural language agent 102 controls the
interactions between the user 106 and the natural language interface system
100. The dialog manager 102f is a finite state machine (FSM) similar to the
one described in Cohen, et al., "The Efficiency of Multimodal Interaction: A
Case Study," 5th International Conference on Spoken Language Processing,
Nov. 30 - Dec. 4, 1998, Sydney, Australia, p. 253, incorporated herein in its
entirety by reference.

The dialog manager 102f handles tasks such as accepting user inputs,
obtaining parameters for tasks, requesting clarification and asking for
confirmation on tasks.

The ability to handle natural language commands extends the concept
of traditional dialog managers. Traditional dialog managers function like
finite state machines (FSM) accepting dialog. For example, as shown in FIG.
3, ordering a pizza through an interactive system requires the user to specify
the type of pizza, such as the size and topping of the pizza. A simplified
model may be adopted where the user must select the size of the pizza (small,
medium, or large) and the toppings (cheese, Hawaiian or pepperoni), and
confirm the order. If changes are to be made to the size while selecting the
topping, then either the ability to make such a change must be written into
the FSM or the user must wait until the end of the ordering sequence.

In contrast, with spoken language commands, many of these dialog
steps are unnecessary. Some FSMs can be generalized to a set of Boolean
operations on a set of choices. In this case choosing a pizza is an AND
operation (size, topping and confirmation) on a set of XOR operations (e.g.,
small, medium or large size). Thus in spoken natural language, the user may
simply say "I wish to order a large cheese pizza."

As is evident, one natural language sentence completes all the choices 
and only a confirmation is necessary. However, additional dialog issues are 
created under various circumstances. For instance, the user may make an 
incomplete request such as "I want a cheese pizza," make an incorrect request 
such as "can you send veggie pizza," request information such as "what types
of pizza do you have," change a request such as "I'd like to make that a small
one," or make an out of context request such as "I want to see my email."

A global state variable may be introduced to allow the dialog manager
102f the flexibility to handle such spoken language requests. The global state
variable uniquely identifies the state of the interaction between the user 106
and the natural language agent 102. The state of the natural language agent
102 can be in one of two classes: IDLE or DEFINED. If the natural language
agent 102 is in IDLE, the natural language agent 102 is not actively engaged
in a dialog with the user 106 and interprets the request in the default global
context. If the natural language agent 102 is in a DEFINED state SI, the
designer of the dialog has the option of specifying a set of semantic frames it
will accept and the associated actions. If the semantic frame is not defined,
the action would be deemed out of context.

With the above-described scheme, if an incomplete request is made,
the user 106 is prompted for more information; if an incorrect request is made,
the user 106 is given a set of options from which to choose; if a change
request is made, the order is changed; and if an out of context request is made,
then the user 106 is asked if a context switch is indeed desired with a warning
that the current context will be lost.
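
The scheme above can be sketched as a small state-driven handler for the pizza example. The sketch is illustrative only: the frame format (a dictionary with an "action" key and optional "size" and "topping" slots), the state name ORDERING_PIZZA, the action names and the prompt strings are all hypothetical, and a real dialog manager would receive its frames from the semantic interpreter rather than from hand-built dictionaries.

```python
from dataclasses import dataclass, field

IDLE = "IDLE"
ORDERING = "ORDERING_PIZZA"  # hypothetical DEFINED state

SIZES = {"small", "medium", "large"}
TOPPINGS = {"cheese", "hawaiian", "pepperoni"}


@dataclass
class DialogManager:
    state: str = IDLE
    order: dict = field(default_factory=dict)

    def handle(self, frame):
        """Process one semantic frame and return the prompt for the user."""
        action = frame.get("action")
        if self.state == IDLE:
            if action == "order_pizza":
                self.state = ORDERING
                return self._update(frame)
            return "handled in default global context"
        # DEFINED state: only the frames listed here are in context.
        if action in ("order_pizza", "change_order"):
            return self._update(frame)
        if action == "list_options":
            return f"sizes: {sorted(SIZES)}; toppings: {sorted(TOPPINGS)}"
        return "out of context: switch and lose the current order?"

    def _update(self, frame):
        for slot, allowed in (("size", SIZES), ("topping", TOPPINGS)):
            value = frame.get(slot)
            if value is None:
                continue
            if value not in allowed:  # incorrect request: offer the options
                return f"choose a {slot} from {sorted(allowed)}"
            self.order[slot] = value
        missing = [s for s in ("size", "topping") if s not in self.order]
        if missing:  # incomplete request: prompt for more information
            return f"please specify {missing[0]}"
        return f"confirm {self.order['size']} {self.order['topping']} pizza?"
```

Because every slot can be filled or changed by any in-context frame, the order in which the user supplies the size and the topping no longer has to be written into the state machine.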



Text-To-Speech Synthesizer 102g 

The natural language agent 102 may offer the user 106 the option of
receiving messages as text-on-screen or as synthesized speech with the
text-to-speech synthesizer 102g. The text-to-speech synthesizer 102g preferably
uses commercial off-the-shelf technology to communicate messages to the user
106 by speech. The text-to-speech synthesizer 102g may utilize intonation to
make the synthesized speech sound more natural to the user 106. In addition,
the natural language interface system 100 may use avatars for output. The
text and speech messages are transmitted in conjunction with other graphical
items that may be displayed by the applications and/or the agents.

Application Class Agent 104 

As shown in FIG. 4 and described above, the agent communication 
module 102d of the natural language agent 102 allows communication 
between the application class agents 104 and the natural language agent 102. 
Each application class agent 104 preferably works with a single class of
applications 112 that have similar conceptual operations. For example,
different email applications generally perform the same conceptual actions of
sending and receiving mail but perform these actions through different sets of
steps.

Each application class agent 104 preferably includes a set of
application wrappers 104A, a semantic or task interpretation engine 104B, an
application class communication or dialog manager 104C, an adaptive
application class preference manager 104D, and an application class help
system (not shown).

The communication between the application class agent 104 and each
different type of vendor-specific application program 112 is via an
application wrapper 104A that translates the conceptual action to a set of
application-specific operations. The task application wrapper 104A is the
interface between the application class agent 104 and different applications
112 in the class. With wrappers 104A, the application class agent 104
communicates with specific applications 112, allowing the incorporation of
existing applications into the architecture of the system 100. For example, an
e-mail agent would have a wrapper for interacting with each email system
such as NETSCAPE and MICROSOFT EXCHANGE.
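
The wrapper arrangement can be pictured as one abstract interface with a concrete implementation per vendor program. The following sketch is an assumption-laden illustration: the class names, the `send` method and the step strings are hypothetical, and each wrapper simply returns a list of steps where a real wrapper would drive the application through a macro language or API.

```python
from abc import ABC, abstractmethod


class MailWrapper(ABC):
    """Common interface the mail class agent uses for every mail program."""

    @abstractmethod
    def send(self, recipient: str, body: str) -> list:
        """Translate the conceptual 'send mail' action into concrete steps."""


class NetscapeWrapper(MailWrapper):
    def send(self, recipient, body):
        # Hypothetical step list; a real wrapper would drive the application.
        return ["open compose window", f"set To: {recipient}",
                f"set body: {body}", "press Send"]


class ExchangeWrapper(MailWrapper):
    def send(self, recipient, body):
        # Hypothetical step list for a second vendor's program.
        return [f"create message for {recipient}", f"set body: {body}",
                "submit message"]


class MailAgent:
    """Application class agent: one conceptual action, many wrappers."""

    def __init__(self, wrapper: MailWrapper):
        self.wrapper = wrapper

    def send_mail(self, recipient, body):
        return self.wrapper.send(recipient, body)
```

Swapping `NetscapeWrapper()` for `ExchangeWrapper()` changes the steps that are issued without changing the request the agent accepts, which is the separation the wrapper is meant to provide.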



To interface with existing applications, the wrapper 104A is preferably
written in one of the platform-specific macro languages. Examples of
platform-specific macro languages are listed in TABLE II.

TABLE II

PLATFORM                                MACRO LANGUAGE
MICROSOFT WINDOWS/95/98/NT              VISUAL TEST
MICROSOFT COM Compliant applications    MICROSOFT COM
X WINDOWS Applications                  XTCL, XTK, PERL
Applications with API                   API calls



The task or semantic interpretation engine 104B is similar to the 
semantic interpretation engine 102c of the natural language agent 102 
described above. The task interpretation engine 104B serves as the knowledge 
base for each agent 104. The task interpretation engine 104B receives the 
semantic frame representation as input. Based on the frame's head verb
(action request) and noun phrases (parameters), the task interpretation engine
104B invokes a routine that sends a set of requests to the task application
wrapper 104A.
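
As a rough sketch of this dispatch step, a frame can be keyed on its head verb, topic and application context, in the manner of the frame listing of TABLE III, with the second noun phrase passed to the routine as its parameter. The class name, the dictionary-based frame format and the request tuples returned by the routines are hypothetical; they only illustrate the lookup.

```python
class TaskInterpreter:
    """Maps a semantic frame to a routine, in the spirit of TABLE III."""

    def __init__(self):
        self.routines = {}

    def register(self, verb, topic, context, routine):
        self.routines[(verb, topic, context)] = routine

    def interpret(self, frame):
        key = (frame["verb"], frame["topic"], frame["context"])
        routine = self.routines.get(key)
        if routine is None:
            raise LookupError(f"no routine registered for {key}")
        # The second noun phrase is passed on as the routine's parameter.
        return routine(frame.get("second"))


def show_email(sender):
    # Returns the requests that would be sent to the application wrapper.
    return [("select_messages", {"from": sender}), ("display", {})]


def delete_email(which):
    return [("select_messages", {"which": which}), ("delete", {})]


engine = TaskInterpreter()
engine.register("SHOW", "EMAIL", "EMAIL-APPLICATION", show_email)
engine.register("DELETE", "EMAIL", "EMAIL-APPLICATION", delete_email)
```

A frame with no registered routine raises an error here; in the described system that case would instead be passed to the dialog manager for clarification.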

The application class dialog manager 104C is similar to the dialog
manager 102f of the natural language agent 102
described above. The application class dialog manager 104C manages the
interaction between the user 106 and the application class agent 104, requests
clarification for ambiguous requests, asks for confirmation, and obtains
incomplete parameters.

The application class adaptive preference manager 104D records the 
user-preferences for each task. The preference is computed in a manner 
similar to the general natural language agent preference calculation for the 
natural language agent adaptive preference manager 102e as described above. 

While the natural language capability of the natural language interface
system 100 desirably removes most of the user's need for help, each
application class preferably has a help capability to enhance the minimum
training feature of the natural language interface system 100 of the present
invention. The help system can be encoded in the application class
interpretation engine 104B such that the request will result in communication
of instructions and explanations from the application class agent 104. For
example, requests such as "How do I," "Can you show me," and "What are the
possible values for" will result in a response from the application class agent
104 with instructions and an explanation of how to perform the task.

The help system may provide various types of help information. The
help system may provide a description of the agent capabilities such as the
general uses of the application and the tasks the agent can perform. While the
natural language interface system 100 is designed for unconstrained input,
ambiguity resolution may require constraints in syntax and the help system
may provide syntax for different tasks to the user 106. Thus, if the user 106 is
unable to get the application class agent 104 to perform a task, the user 106
may ask how to execute an operation. The help system can respond with a
sample natural language sentence. In addition, the help system can also
provide suitable parameter values and ranges as well as the typical general
help information normally included with the application on, for example, how
to use the specific application.

Example: Address Book Agent 

The operation of the system 100 will be briefly described with reference
to an address book agent as an example. The address book agent comprises a
task interpretation engine, a dialog manager and one or more task wrappers.
The typical key actions of an address book include show (to display all or part
of an address), change (to change all or part of an address), add (to add a new
address), delete (to delete an existing address), sort (to arrange addresses by a
given category), open/close (to open or close an address book), save (to save
an address book), and copy/paste (to copy and paste data from one part of an
address book to another part).

These actions can be interpreted by the address book agent with
reference to a semantic frames knowledge base. The frames are also inserted
in the natural language agent routing table. An example of a listing of such
frames is shown in TABLE III. An application wrapper interfaces with the
particular address book application. The routines will handle the tasks as
described above and will interface to the address book module for, e.g.,
MICROSOFT EXCHANGE and NETSCAPE.

Semantic Mapping 

FIG. 5 schematically illustrates the mapping of the user's input phrase,
command or sentence within a large set of syntactically correct natural
language phrases, commands or sentences 140 into a set of semantic tasks or
actions 142 by the semantic mapper 144. Preferably, a semantic mapper 144
is provided for the natural language semantic interpreter 102c of the natural
language agent 102 and/or the semantic interpretation engine 104B of each
application class agent. For example, a different semantic mapper 144 may be
provided for word processing applications, e-mail applications and
spreadsheet applications. TABLE IV provides an illustrative listing of task
agents for a class of applications and a list of sample tasks corresponding to
each task agent.



TABLE III

Action-Concept  Topic Concept       Second Noun  Application-Context       Routine
(Verb)          (Main Noun Phrase)  Phrase       (State)
SHOW            EMAIL               JOHN         EMAIL-APPLICATION         Pointer to routine for showing John's email
SHOW            EMAIL-ADDRESS       Current      ADDRESS BOOK-APPLICATION  Pointer to routine for showing current email
DELETE          EMAIL               LAST         EMAIL-APPLICATION         Pointer to routine to delete last email



Each task agent for a class of applications is preferably provided with
its own set of semantically correct sentences, semantic actions and semantic
mapping. Each task agent thus serves as the common user interface for the
corresponding class of applications under the assumption that each application
within the class accomplishes the same or generally overlapping set of tasks.
In other words, in a given class of applications, there is a finite and relatively
small set of semantically equivalent actions or tasks 142 which may be
performed by each application within the class.



TABLE IV

TASK AGENT                            SAMPLE TASKS
Mail                                  Sends, receives, composes and views email
Fax                                   Sends, receives, composes and views faxes
Letter                                Composes, writes, and sends letters
File                                  Manages files
OS                                    Manages OS, configuration, performance
Address                               Manages Address books
Games                                 Plays and operates specific games
Flight Simulators                     Runs flight simulators
Vehicle Simulators                    Operates Land Vehicle Simulators (cars, bikes, tanks, etc.)
Naval Simulators                      Operates water-based simulators (ships, subs, etc.)
Sports Simulators                     Operates Baseball, Soccer, Football, etc. simulators
War Game & Strategy Simulators        Operates turn and real time-based war games, single
                                      and multi-player
Role Play Simulators                  Operates avatar-based role playing games such as
                                      DOOM, TOMB RAIDER, ADVENTURE, ZORK
Action Simulators                     Operates action games
PIM (personal information manager)    General interface to task, calendar, address book, and
                                      notebook manager
Printer                               Selects and configures printers, prints documents
Calendar                              Manages calendar, sets meetings and ticklers
Terminal                              Connects to remote systems, logs on, logs off
Travel                                Arranges travel
Encyclopedia                          Searches for and displays information from
                                      encyclopedic literature
Image Viewer                          Views and manipulates images
C++                                   Assists in managing and writing C/C++ Programs
Basic                                 Assists in managing and writing Basic Programs
GUI                                   Manipulates, configures and arranges Graphical User
                                      Interface
Presentation                          Draws, arranges and manipulates slide presentations
Charting                              Charts sets of numbers in various graphs and charts
Meeting                               Arranges and schedules meetings
Scheduler                             Schedules tasks on the computer
Telephone                             Dials out and receives calls; integrates with address
                                      book
Voice Mail                            Sends, receives, plays, manages voicemail
Word Processor                        Writes, prints, manipulates, formats documents
Spreadsheet                           Writes, prints, manipulates, formats numerical data
Drawing                               Draws, manipulates, formats diagrams, merges
                                      predrawn images
Web                                   Connects, navigates, searches the internet/world-wide
                                      web
Network                               Configures networks, manages connections
Mathematical                          Manages mathematical and scientific manipulation of
                                      numeric and formulaic data
Directory Assistance                  Locates telephone numbers and addresses over the
                                      internet
Internet Retail Sales                 Describes objects and sells to customers over the
                                      internet
Common Household Utility Agent        Controls household devices
(e.g., VCR, Toaster, HVAC)
K-12 Education on, e.g.,              Teaches, runs games, quizzes, etc. for mathematics
Physics, Chemistry, Math              and science subjects
General Education on History,         Teaches specific liberal arts and humanities courses
Economics, Philosophy
Hands-on training for                 Trains users to operate equipment
job-based tasks
Internet Event lookup                 Locates events, such as conferences, meetings,
                                      concerts and festivals, through the internet
Internet Product-Information lookup   Finds products and prices through the internet
Internet-based Meeting scheduler      Schedules meetings over the internet
Hardware manager                      Manages Computer Hardware (screen, disk, etc.)



For example, for the word processor class of applications, a user may
input "compose a letter to John Smith," "please begin drafting a letter to John
Smith," or "can you create a letter for my friend, John Smith?", each of which
is a syntactically correct sentence within the large set of syntactically correct
sentences 140. These user commands are semantically equivalent. In each
case, the semantic mapper 144 maps the user input to a specific action within
the small set of semantic actions 142. In this example, the semantic mapper
144 maps each of these user inputs to the same action, draft a letter to John
Smith, and the same task is performed. Thus, the semantic mapper 144
ensures that the same task is performed in a given class of applications
regardless of the specific user input.
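
The many-to-one character of this mapping can be illustrated with a deliberately naive keyword-based sketch. It is not the semantic mapper 144 itself, which operates on parsed sentences; the synonym table, the tuple format of the action, and the rule that the recipient is the trailing run of capitalized words are all simplifying assumptions made for the example.

```python
import re

# Hypothetical synonym table: several surface verbs, one semantic action.
VERB_SYNONYMS = {
    "compose": "DRAFT",
    "draft": "DRAFT",
    "drafting": "DRAFT",
    "create": "DRAFT",
    "write": "DRAFT",
}


def map_to_action(sentence):
    """Reduce a sentence to a (verb, topic, recipient) semantic action."""
    words = re.findall(r"[A-Za-z']+", sentence)
    lowered = [w.lower() for w in words]
    verb = next((VERB_SYNONYMS[w] for w in lowered if w in VERB_SYNONYMS), None)
    topic = "LETTER" if "letter" in lowered else None
    # Assumption: the recipient is the trailing run of capitalized words.
    recipient = []
    for w in reversed(words):
        if not w[0].isupper():
            break
        recipient.insert(0, w)
    return (verb, topic, " ".join(recipient) or None)
```

Each of the three sample inputs above reduces to the same tuple, so whichever phrasing the user chooses, the task agent receives one and the same action.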

Each application in the class may have a different method for
accomplishing the same semantic task. In response to any of the user inputs in
the example above, a word processor application composes or drafts a letter to
John Smith although the particular word processor application may utilize an
approach different from the approach used by another word processor
application. By using a core set of semantically equivalent tasks 142 for each
class of applications, the present invention allows the user to accomplish the
same semantic task independent of the specific application utilized.

Although a single task agent is preferably provided for each class of
applications, the task engine of each task agent includes a process-specific
execution module for each application. For example, the word processing task
agent may include an execution module for MICROSOFT WORD and another
execution module for WORD PERFECT. The process-specific execution
module translates the semantic action for the specific application.

The semantic mapper 144 is capable of reducing idiomatic sentences
and outputting a mapped semantic action. Input sentences may be generally
classified as wh-query, request, let construction, infinitive, embedded clause,
semantic mappings and context-dependent. Examples of the classifications of
input sentences are shown in Table V. Regardless of the classification of the
input sentence, each input sentence is mapped into a semantic action.
Preferably, each mapped semantic action is in the form of a verb phrase or the
imperative case with an implied noun phrase. "Show me mail message" is an
example of an imperative verb phrase having "you" as the implied noun phrase.



TABLE V

Wh-query
    What I wouldn't give to see the back of a blue car
    What is stopping you from showing me my mail
    What would I give to see my mail
    Why don't you show me my mail
    How is my email situation today
    Why can't I see my mail
    Why don't you clean my mailbox

Request
    Can I see my mail
    Can you show me my mail
    Can you like show me my mail
    May I see my mail
    Would you allow me to see my mail
    Will you show me my mail

Let construction
    Let me see my mail
    Let my mail be shown
    Let my mail be seen by me
    Let me know when I get mail

Infinitive
    I want to check out my mail
    I would love to check out my mail
    We want to see our mail now
    I need to see my mail now
    I want to know if I got new mail
    I want to access my mail
    He wants to see his mail
    Susan wants to see her mail

Embedded clause
    I will be upset if you do not show me mail right now
    I would really appreciate it if you could show me my mail
    I think it would be great if you could show me my mail
    I want you to tell me if I have new mail
    I'm looking to see whether I have new mail
    We would like it if you could show us our mail
    I want you to inform me if I have new mail

Semantic mappings
    I would like for you to give me a view where I can see my mail
    I would love it if I could play with my mail
    Can you show me something like mail
    Can you show me what everyone has sent to me

Context-dependent
    Let me see what you have
    I want you to summarize these
    What else can you do
    Who sent the last one
    I want to know all about the things you can do

In addition to the various input sentences, the spoken input sentences
108 given by the user 106 may contain one or more of several types of errors.
These errors include unrecognized word, bad parse,
unhandled verb, unhandled object, unhandled verb/object attribute and/or
task-specific error. Some errors may be better handled and addressed at the
natural language agent 102 while other errors may be better handled and
addressed at the appropriate task agent 104. For example, errors relating to
unrecognized word, bad parse and unhandled verb are preferably handled and
addressed at the natural language agent 102. Errors relating to unhandled
object may be handled and addressed at either the natural language agent 102
or the task agent 104. Further, errors relating to unhandled verb/object
attribute and task-specific errors are preferably handled and addressed at the
task agent 104.
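
The division of error handling described above amounts to a small routing rule, sketched below. The enumeration and function names are hypothetical, and the tie-break used for unhandled objects (prefer the task agent when one has already been identified) is an assumption, since the text only states that either agent may handle that case.

```python
from enum import Enum


class ErrorKind(Enum):
    UNRECOGNIZED_WORD = "unrecognized word"
    BAD_PARSE = "bad parse"
    UNHANDLED_VERB = "unhandled verb"
    UNHANDLED_OBJECT = "unhandled object"
    UNHANDLED_ATTRIBUTE = "unhandled verb/object attribute"
    TASK_SPECIFIC = "task-specific error"


NL_AGENT = "natural language agent 102"
TASK_AGENT = "task agent 104"

NL_AGENT_ERRORS = {ErrorKind.UNRECOGNIZED_WORD, ErrorKind.BAD_PARSE,
                   ErrorKind.UNHANDLED_VERB}
TASK_AGENT_ERRORS = {ErrorKind.UNHANDLED_ATTRIBUTE, ErrorKind.TASK_SPECIFIC}


def route_error(kind, task_agent_known=False):
    """Return which agent should handle and address the error."""
    if kind in NL_AGENT_ERRORS:
        return NL_AGENT
    if kind in TASK_AGENT_ERRORS:
        return TASK_AGENT
    # Unhandled object may go to either agent; this tie-break is an assumption.
    return TASK_AGENT if task_agent_known else NL_AGENT
```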

As discussed above, the interface 100 of the present invention is an
adaptive natural language interface 100. The output of the natural language
agent 102 is preferably adaptive to the personality of the user 106 by first
identifying the personality type, personality trait or characteristics of the user
and utilizing that identification for responding to the user. FIG. 6 illustrates
an example of a personality assessment grid where users may be grouped into
one of four types: analytical, driver, amiable and expressive, which are
defined depending upon the relative levels of assertiveness and
responsiveness. The natural language agent may make a determination as to
which of the four types best characterizes the user from factors such as the
user's tone, pitch, speed and the actual words used by the user. Of course, the
natural language agent may utilize any other factors, personality assessment
methods and/or personality characterization schemes.

The natural language agent 102 is adaptive in that it utilizes that
determination of the user 106 in responding to the user by delivering the
output to the user or in requesting additional information from the user using
a simplified emotional response. The determination thus may affect the tone,
pitch, speed and/or the actual words used to respond to the user. For example,
in delivering the output to the user or in requesting additional information
from the user, the natural language agent may be empathetic and
express similar levels of assertiveness and/or responsiveness by varying the
words used, the speed at which the words are delivered, the tone and/or the
pitch of the words. In addition, the form of as well as the specific graphical
interface seen by the user may be determined by the user, the application
currently utilized and/or based upon the determination of the user's
personality.
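
A two-axis grid of this kind can be sketched as a simple quadrant test. The sketch rests on several assumptions that are not stated in the text: assertiveness and responsiveness are taken to be scores between 0 and 1, the quadrant assignment follows the commonly cited social-styles arrangement (driver as high assertiveness with low responsiveness, expressive as high on both, amiable as low assertiveness with high responsiveness, analytical as low on both), and the speech-rate and wording adjustments are purely hypothetical.

```python
def personality_type(assertiveness, responsiveness, midpoint=0.5):
    """Place a user in one of the four quadrants of the grid.

    Inputs are assumed to be scores in [0, 1]; the quadrant assignment
    is an assumption, since the text does not spell it out.
    """
    high_a = assertiveness >= midpoint
    high_r = responsiveness >= midpoint
    if high_a:
        return "expressive" if high_r else "driver"
    return "amiable" if high_r else "analytical"


def response_style(assertiveness, responsiveness):
    """Mirror the user's levels in the agent's own delivery (hypothetical scales)."""
    return {
        "type": personality_type(assertiveness, responsiveness),
        "speech_rate": 0.8 + 0.4 * assertiveness,  # faster for assertive users
        "wording": "warm" if responsiveness >= 0.5 else "concise",
    }
```

The scores themselves would have to be estimated from the factors named above, such as tone, pitch, speed and word choice; that estimation is outside the scope of this sketch.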

Although the adaptive natural or spoken language user interface
system 100 is described above in terms of natural language speech input, the
interface system can also recognize and interpret natural language non-speech
commands, such as text. The natural language interface is preferably embodied
in a computer program product in the form of computer coded instructions
executable by a computer processor and stored in a computer readable
medium.

FIG. 7 illustrates an example of a computer system that can be utilized
to execute the software of an embodiment of the invention and use hardware
embodiments. FIG. 7 shows a computer system 201 that includes a display
203, screen 205, cabinet 207, keyboard 209, and mouse 211. Mouse 211 can
have one or more buttons for interacting with a GUI. Cabinet 207 houses a
CD-ROM drive and/or a floppy disc drive 213, system memory and a hard
drive (see FIG. 8) which can be utilized to store and retrieve software
programs incorporating computer code that implements aspects of the
invention, data for use with the invention, and the like. Although a CD-ROM
and a floppy disk 215 are shown as exemplary computer readable storage
media, other computer readable storage media including magnetic tape,
flash memory, system memory, RAM, other types of ROM, and hard drive can
be utilized. Additionally, a data signal embodied in a carrier wave (e.g., in a
network including the Internet) can be the computer readable storage medium.

FIG. 8 shows a system block diagram of computer system 201 used to
execute the software of an embodiment of the invention or use hardware
embodiments. As in FIG. 7, computer system 201 includes monitor 203,
keyboard 209, and mouse 211. Computer system 201 further includes
subsystems such as a central processor 251, system memory 253, fixed storage
255 (e.g., hard drive), removable storage 257 (e.g., CD-ROM drive), display
adapter 259, sound card 261, transducers 263 (speakers, microphones, and the
like), and network interface 265. Other computer systems suitable for use
with the invention can include additional or fewer subsystems. For example,
another computer system could include more than one processor 251 (i.e., a
multi-processor system) or a cache memory.

The system bus architecture of computer system 201 is represented by 
arrows 267. However, these arrows are illustrative of any interconnection 
scheme serving to link the subsystems. For example, a local bus could be 
utilized to connect the central processor to the system memory and display 
adapter. Computer system 201 shown in FIG. 8 is but an example of a
computer system suitable for use with the invention. Other computer
architectures having different configurations of subsystems can also be
utilized.

While the preferred embodiments of the present invention are
described and illustrated herein, it will be appreciated that they are merely
illustrative and that modifications can be made to these embodiments without
departing from the spirit and scope of the invention. Thus, the invention is
intended to be defined only in terms of the following claims.

CLAIMS

What is claimed is: 

1. A natural language interface for a computer system that interprets
natural language user input, the natural language interface comprising:
    a natural language agent adapted to receive and interpret the
natural language user input and adapted to output an output command; and
    at least one application agent adapted to receive and further
interpret the output command from the natural language agent and adapted to
output an executable instruction to an application program,
    the natural language agent including:
        a syntactic parser adapted to generate a parsed sentence from
the natural language user input,
        a semantic interpreter adapted to generate the output
command from the parsed sentence, and
        an agent communication manager adapted to provide
communication between the semantic interpreter and the at least one
application agent,
    each of the at least one application agent including:
        a semantic task interpreter adapted to generate the executable
instruction from the output command of the natural language agent, and
        at least one application wrapper, each wrapper configured to
communicate with a corresponding application program.

2. The natural language interface of claim 1, wherein the semantic
interpreter of the natural language agent includes a semantic mapper adapted
to map the parsed sentence to a semantic action as the output command.

3. The natural language interface of claim 1, wherein the natural
language agent further includes a speech recognition system adapted to receive
and recognize natural language user speech input and to generate the natural
language user request therefrom.

4. The natural language interface of claim 1, wherein the natural
language agent further includes a dialog manager adapted to provide feedback
to the user indicating the natural language agent's understanding of the natural
language user input and to interact with the user in natural language to clarify
the natural language user input as necessary.

5. The natural language interface of claim 4, wherein the natural
language agent further includes a text to speech synthesizer adapted to provide
the speech feedback to the user in speech.

6. The natural language interface of claim 1, wherein the natural
language agent further includes an adaptive preference manager adapted to
generate default conditions preferred by the user, the default conditions being
specific to each user and/or common to multiple users.

7. The natural language interface of claim 1, wherein the semantic
task interpreter of each application agent further includes a semantic mapper
for mapping the output command to a semantic action as the executable
instruction.

5 

8. The natural language interface of claim 1, wherein each of the at 
least one application agent further includes a dialog manager adapted to 
provide natural language feedback to the user indicating the application 
agent's understanding of the natural language user input and to interact with 

10 the user in natural language to clarify the natural language user input as 

necessary. 

9. The natural language interface of claim 1 , wherein each of the at 
least one application agent further includes an adaptive preference manager 

15 adapted to generate default conditions preferred by the user for the specific 

application, the default conditions being specific to each user and/or common 
to multiple users. 

10. The natural language interface of claim 1, comprising one of the
at least one application agent for each class of application programs, each
class of application programs being selected from the group consisting of
electronic mail, fax, letter, file, operating system, address, games, flight 
simulators, vehicle simulators, naval simulators, sports simulators, war game 
and strategy simulators, role play simulators, action simulators, personal 
information manager, printer, calendar, terminal, travel, encyclopedia, image
viewer, C++, Basic, graphical user interface, presentation, charting, meeting,
scheduler, telephone, voice mail, word processor, spreadsheet, drawing, web, 
network, mathematical, directory assistance, internet retail sales, common 
household utility agent, K-12 education, general education, hands-on training 
for job-based tasks, internet event lookup, internet product-information 
lookup, internet-based meeting scheduler, and hardware manager. 

11. A computer readable medium on which are stored instructions 
executable on a computer processor, the instructions comprising: 
receiving natural language user input; 
generating a parsed sentence from the natural language user input;
mapping the parsed sentence into a semantic action; and
generating an instruction from the semantic action, the instruction
being executable by an application.

12. The computer readable medium of claim 11, wherein receiving
natural language user input includes receiving natural language speech input. 

13. The computer readable medium of claim 11, the instructions
further comprising:

providing feedback to the user indicating the processor's
understanding of the natural language user input; and

interacting with the user in natural language to clarify the natural
language user input as necessary.

14. The computer readable medium of claim 13, wherein providing
feedback to the user includes providing speech feedback to the user.



15. The computer readable medium of claim 11, the instructions
further comprising generating a set of default conditions for executing the
instruction by an application, the default conditions being specific to each user
and/or common to multiple users.

16. The computer readable medium of claim 11, wherein the
application is one of one or more applications selected from the group 
consisting of electronic mail, fax, letter, file, operating system, address, 
games, flight simulators, vehicle simulators, naval simulators, sports 
simulators, war game and strategy simulators, role play simulators, action 
simulators, personal information manager, printer, calendar, terminal, travel, 
encyclopedia, image viewer, C++, Basic, graphical user interface, 
presentation, charting, meeting, scheduler, telephone, voice mail, word 
processor, spreadsheet, drawing, web, network, mathematical, directory 
assistance, internet retail sales, common household utility agent, K-12 
education, general education, hands-on training for job-based tasks, internet 
event lookup, internet product-information lookup, internet-based meeting 
scheduler, and hardware manager. 



17. The computer readable medium of claim 11, wherein the
computer readable medium is selected from the group consisting of CD-ROM, 
zip disk, floppy disk, tape, flash memory, system memory, hard drive, and 
data signal embodied in a carrier wave. 

18. A method for receiving, interpreting and executing natural
language user input, comprising:

receiving natural language user input;

generating a parsed sentence from the natural language user input,

semantically interpreting the parsed sentence and generating an
output command from the parsed sentence,

outputting the output command to an application class agent,

semantically interpreting the output command and generating an
executable instruction from the output command, and

outputting the executable instruction to an application program
for execution by the application program.
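
The steps recited in claim 18 can be read as a two-stage interpretation pipeline. The following is only a minimal illustrative sketch under stated assumptions, not the disclosed implementation: the toy grammar, the `VERB_TABLE` mapping, the `EmailAgent` class, and the `mail.send` instruction name are all hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedSentence:
    verb: str
    obj: str
    modifiers: dict = field(default_factory=dict)


@dataclass
class OutputCommand:
    agent_class: str  # application class agent that should receive the command
    action: str
    args: dict


# Hypothetical verb table: verb -> (application class, abstract action).
VERB_TABLE = {"send": ("email", "send_message")}


def parse(text: str) -> ParsedSentence:
    """Toy syntactic parser for '<verb> <object> [to <recipient>]' sentences."""
    words = text.lower().rstrip(".!?").split()
    if len(words) < 2:
        raise ValueError(f"cannot parse: {text!r}")
    modifiers = {}
    if "to" in words[2:]:
        i = words.index("to", 2)
        modifiers["to"] = " ".join(words[i + 1:])
        words = words[:i]
    return ParsedSentence(words[0], " ".join(words[1:]), modifiers)


def interpret(parsed: ParsedSentence) -> OutputCommand:
    """First semantic stage: parsed sentence -> output command."""
    if parsed.verb not in VERB_TABLE:
        raise ValueError(f"unknown verb: {parsed.verb!r}")
    agent_class, action = VERB_TABLE[parsed.verb]
    return OutputCommand(agent_class, action, {"object": parsed.obj, **parsed.modifiers})


class EmailAgent:
    """Second semantic stage: output command -> executable instruction."""

    def interpret(self, cmd: OutputCommand) -> dict:
        return {
            "call": "mail.send",  # hypothetical application entry point
            "recipient": cmd.args.get("to", ""),
            "body": cmd.args["object"],
        }


# One agent per application class; only a hypothetical e-mail agent is shown.
AGENTS = {"email": EmailAgent()}


def run(text: str) -> dict:
    """Receive input, parse, interpret, route to the class agent, and
    return the instruction that would be handed to the application."""
    cmd = interpret(parse(text))
    return AGENTS[cmd.agent_class].interpret(cmd)
```

In this sketch, `run("Send the report to Alice.")` yields an instruction dictionary addressed to the hypothetical mail application; a real system would dispatch that instruction through an application wrapper rather than return it.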



19. The method for receiving, interpreting and executing natural
language user input of claim 18, wherein receiving natural language user input
includes receiving natural language speech input.

20. The method for receiving, interpreting and executing natural 
language user input of claim 18, further comprising:

providing feedback to the user indicating the processor's
understanding of the natural language user input; and

interacting with the user in natural language to clarify the natural
language user input as necessary.
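
One way to picture the feedback and clarification steps of claim 20 is slot filling: the system echoes what it understood and asks about anything still missing. The sketch below is an assumption made for illustration only; the `REQUIRED_SLOTS` table, the slot names, and the `ask` callback are hypothetical and are not taken from the specification.

```python
# Hypothetical table of the arguments each action needs before it can run.
REQUIRED_SLOTS = {"send_message": ["to", "object"]}


def feedback(action: str, args: dict) -> str:
    """Tell the user what the system currently understands."""
    known = ", ".join(f"{k}={v!r}" for k, v in sorted(args.items())) or "nothing yet"
    return f"I understood '{action}' with {known}."


def clarify(action: str, args: dict, ask) -> dict:
    """Ask the user, in natural language, for each missing slot.

    `ask` is any callable that takes a question string and returns the
    user's reply (for example, the built-in `input`).
    """
    filled = dict(args)
    for slot in REQUIRED_SLOTS.get(action, []):
        while not filled.get(slot):
            filled[slot] = ask(f"Please specify '{slot}' for {action}.").strip()
    return filled
```

Passing `ask` as a parameter keeps the dialog logic independent of whether the reply arrives as typed text or as recognized speech.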

21. The method for receiving, interpreting and executing natural
language user input of claim 20, wherein providing feedback to the user 
includes providing speech feedback to the user. 

22. The method for receiving, interpreting and executing natural 
language user input of claim 18, further comprising generating a set of default 
conditions for executing the instruction by an application, the default
conditions being specific to each user and/or common to multiple users. 

23. The method for receiving, interpreting and executing natural 
language user input of claim 18, wherein semantically interpreting the parsed 
sentence and generating the output command includes mapping the parsed 
sentence into a semantic action as the output command. 

24. The method for receiving, interpreting and executing natural 
language user input of claim 18, wherein semantically interpreting the output 
command and generating the executable instruction includes mapping the 
output command into a semantic action as the executable instruction. 

25. The method for receiving, interpreting and executing natural
language user input of claim 18, wherein the application is one of one or more
applications selected from the group consisting of electronic mail, fax, letter,
file, operating system, address, games, flight simulators, vehicle simulators,
naval simulators, sports simulators, war game and strategy simulators, role
play simulators, action simulators, personal information manager, printer,
calendar, terminal, travel, encyclopedia, image viewer, C++, Basic, graphical
user interface, presentation, charting, meeting, scheduler, telephone, voice
mail, word processor, spreadsheet, drawing, web, network, mathematical,
directory assistance, internet retail sales, common household utility agent,
K-12 education, general education, hands-on training for job-based tasks, internet
event lookup, internet product-information lookup, internet-based meeting
scheduler, and hardware manager.